In medicine and health-related fields, a reference range or reference interval is the range or interval of values that is deemed normal for a physiological measurement in healthy persons (for example, the amount of creatinine in the blood, or the partial pressure of oxygen). It is a basis of comparison for a physician or other health professional to interpret a set of test results for a particular patient. Some important reference ranges in medicine are reference ranges for blood tests and reference ranges for urine tests.
The standard definition of a reference range (usually referred to if not otherwise specified) originates in what is most prevalent in a reference group taken from the general (i.e. total) population. This is the general reference range. However, there are also optimal health ranges (ranges that appear to have the optimal health impact) and ranges for particular conditions or statuses (such as pregnancy reference ranges for hormone levels).
Values within the reference range (WRR) are those within normal limits (WNL). The limits are called the upper reference limit (URL) or upper limit of normal (ULN) and the lower reference limit (LRL) or lower limit of normal (LLN). In health care–related publishing, style guides sometimes prefer the word reference over the word normal to prevent the nontechnical word sense of normal from being conflated with the statistical sense. Values outside a reference range are not necessarily pathologic, and they are not necessarily abnormal in any sense other than statistically. Nonetheless, they are indicators of probable pathosis. Sometimes the underlying cause is obvious; in other cases, challenging differential diagnosis is required to determine what is wrong and thus how to treat it.
A cutoff or threshold is a limit used for binary classification, mainly between normal versus pathological (or probably pathological). Establishment methods for cutoffs include using an upper or a lower limit of a reference range.
Reference ranges that are given by this definition are sometimes referred to as standard ranges.
Since a range is a defined statistical value (see range (statistics)) that describes the interval between the smallest and largest values, many, including the International Federation of Clinical Chemistry, prefer to use the expression reference interval rather than reference range.
Regarding the target population, if not otherwise specified, a standard reference range generally denotes the one in healthy individuals, or without any known condition that directly affects the ranges being established. These are likewise established using reference groups from the healthy population, and are sometimes termed normal ranges or normal values (and sometimes "usual" ranges/values). However, using the term normal may not be appropriate as not everyone outside the interval is abnormal, and people who have a particular condition may still fall within this interval.
However, reference ranges may also be established by taking samples from the whole population, with or without diseases and conditions. In some cases, diseased individuals are taken as the population, establishing reference ranges among those having a disease or condition. Preferably, there should be specific reference ranges for each subgroup of the population that has any factor that affects the measurement, such as, for example, specific ranges for each sex, age group, race or any other general determinant.
Assuming a normal distribution, the standard reference range (a 95% prediction interval) is estimated as:

lower limit = m − t0.975,n−1 × √((n+1)/n) × s.d.
upper limit = m + t0.975,n−1 × √((n+1)/n) × s.d.

where m is the arithmetic mean, s.d. is the standard deviation, and t0.975,n−1 is the 97.5% quantile of a Student's t-distribution with n−1 degrees of freedom.
When the sample size is large (n ≥ 30), t0.975,n−1 approaches 1.96 and √((n+1)/n) approaches 1, so the limits can be approximated as m ± 1.96 × s.d.
This method is often acceptably accurate if the standard deviation, as compared to the mean, is not very large. A more accurate method is to perform the calculations on logarithmized values, as described in a separate section later.
The following example of this (non-logarithmized) method is based on values of fasting plasma glucose (FPG) taken from a reference group of 12 subjects:
Subject | FPG (mmol/L) | Difference from mean (m = 5.33) | Squared difference
1 | 5.5 | 0.17 | 0.029
2 | 5.2 | −0.13 | 0.017
3 | 5.2 | −0.13 | 0.017
4 | 5.8 | 0.47 | 0.221
5 | 5.6 | 0.27 | 0.073
6 | 4.6 | −0.73 | 0.533
7 | 5.6 | 0.27 | 0.073
8 | 5.9 | 0.57 | 0.325
9 | 4.7 | −0.63 | 0.397
10 | 5.0 | −0.33 | 0.109
11 | 5.7 | 0.37 | 0.137
12 | 5.2 | −0.13 | 0.017
Mean = 5.33 (m) | | | Sum/(n−1) = 1.95/11 = 0.18 = variance; √0.18 ≈ 0.42 = '''standard deviation (s.d.)'''
As can be given from, for example, a table of selected values of Student's t-distribution, the 97.5% percentile with 12 − 1 = 11 degrees of freedom corresponds to t0.975,11 = 2.20.
Subsequently, the lower and upper limits of the standard reference range are calculated as:

lower limit = 5.33 − 2.20 × √(13/12) × 0.42 ≈ 4.4
upper limit = 5.33 + 2.20 × √(13/12) × 0.42 ≈ 6.3
Thus, the standard reference range for this example is estimated to be 4.4 to 6.3 mmol/L.
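As an illustration, the calculation above can be reproduced with a short Python sketch (the values and the t-quantile 2.20 are those of the worked example; this is only a check of the arithmetic, not a general implementation):

```python
import math
import statistics

# Fasting plasma glucose values (mmol/L) of the 12 reference subjects
fpg = [5.5, 5.2, 5.2, 5.8, 5.6, 4.6, 5.6, 5.9, 4.7, 5.0, 5.7, 5.2]

n = len(fpg)
m = statistics.mean(fpg)        # arithmetic mean, ~5.33
sd = statistics.stdev(fpg)      # sample standard deviation, ~0.42
t = 2.20                        # 97.5% t-quantile, 11 degrees of freedom

margin = t * math.sqrt((n + 1) / n) * sd
lower, upper = m - margin, m + margin
print(round(lower, 1), round(upper, 1))  # 4.4 6.3
```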
If the reference range is estimated assuming a normal distribution, the 90% confidence interval of each reference range limit can be approximated as:

confidence interval of a limit = limit ± 2.81 × SD/√n

where SD is the standard deviation, and n is the number of samples.
Taking the example from the previous section, the number of samples is 12 and the standard deviation is 0.42 mmol/L, resulting in a margin of 2.81 × 0.42/√12 ≈ 0.34 mmol/L.
Thus, the lower limit of the reference range can be written as 4.4 (90% CI 4.1–4.7) mmol/L.
Likewise, with similar calculations, the upper limit of the reference range can be written as 6.3 (90% CI 6.0–6.6) mmol/L.
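These confidence intervals can be checked with a small Python sketch (using the example's figures for the standard deviation, sample size, and reference limits):

```python
import math

sd, n = 0.42, 12                      # from the worked example
margin = 2.81 * sd / math.sqrt(n)     # ~0.34 mmol/L

# 90% confidence intervals of the lower (4.4) and upper (6.3) limits
cis = [(round(limit - margin, 1), round(limit + margin, 1))
       for limit in (4.4, 6.3)]
print(cis)  # [(4.1, 4.7), (6.0, 6.6)]
```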
These confidence intervals reflect random error, but do not compensate for systematic error, which in this case can arise from, for example, the reference group not having fasted long enough before blood sampling.
As a comparison, actual reference ranges used clinically for fasting plasma glucose are estimated to have a lower limit of approximately 3.8 to 4.0 mmol/L, and an upper limit of approximately 6.0 to 6.1 mmol/L (Reference range list from Uppsala University Hospital ("Laborationslista"), Artnr 40284 Sj74a, issued April 22, 2008).
Log-normal distribution

In practice, many biological parameters tend to follow a log-normal distribution. An explanation for this is that the event where a sample has half the value of the mean or median tends to have almost the same probability of occurring as the event where a sample has twice that value. Also, only a log-normal distribution can compensate for the inability of almost all biological parameters to take negative values (at least when measured on absolute scales), with the consequence that there is no definite limit to the size of outliers (extreme values) on the high side, but, on the other hand, values can never be less than zero, resulting in a positive skewness.
As shown in the diagram at right, this phenomenon has relatively small effect if the standard deviation (as compared to the mean) is relatively small, as it makes the log-normal distribution appear similar to a normal distribution. Thus, the normal distribution may be more appropriate to use with small standard deviations for convenience, and the log-normal distribution with large standard deviations.
In a log-normal distribution, the geometric standard deviation and geometric mean estimate the 95% prediction interval more accurately than their arithmetic counterparts.
Necessity

The necessity of establishing a reference range by log-normal rather than normal distribution can be regarded as depending on how much difference it would make not to do so, which can be described as the ratio:

difference ratio = |limitlog-normal − limitnormal| / limitlog-normal

where limitlog-normal is the (lower or upper) limit as estimated by assuming a log-normal distribution, and limitnormal is the corresponding limit as estimated by assuming a normal distribution.
This difference can be put solely in relation to the coefficient of variation, as in the diagram at right, where:

coefficient of variation (CV) = s.d. / m

where s.d. is the arithmetic standard deviation and m is the arithmetic mean.
In practice, it can be regarded as necessary to use the establishment methods of a log-normal distribution if the difference ratio becomes more than 0.1, meaning that a (lower or upper) limit estimated from an assumed normal distribution would be more than 10% different from the corresponding limit as estimated from a (more accurate) log-normal distribution. As seen in the diagram, a difference ratio of 0.1 is reached for the lower limit at a coefficient of variation of 0.213 (21.3%), and for the upper limit at a coefficient of variation of 0.413 (41.3%). The lower limit is more affected by an increasing coefficient of variation, and its "critical" coefficient of variation of 0.213 corresponds to a ratio of (upper limit)/(lower limit) of 2.43. Thus, as a rule of thumb, if the upper limit is more than 2.4 times the lower limit when estimated by assuming a normal distribution, the calculations should be redone using a log-normal distribution.
Taking the example from previous section, the standard deviation (s.d.) is estimated at 0.42 and the arithmetic mean (m) is estimated at 5.33. Thus the coefficient of variation is 0.079. This is less than both 0.213 and 0.413, and thus both the lower and upper limit of fasting blood glucose can most likely be estimated by assuming normal distribution. More specifically, the coefficient of variation of 0.079 corresponds to a difference ratio of 0.01 (1%) for the lower limit and 0.007 (0.7%) for the upper limit.
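These difference ratios can be checked with a Python sketch. Note the assumptions: it uses the large-sample normal quantile 1.96 rather than the small-sample t-based limits, and the standard moment-matching formulas for the log-normal parameters (described in a later section):

```python
import math

m, sd, z = 5.33, 0.42, 1.96   # arithmetic mean, s.d., normal 97.5% quantile

# Moment-matched log-normal parameters (an assumption of this sketch)
sigma_log = math.sqrt(math.log(1 + (sd / m) ** 2))
mu_log = math.log(m) - sigma_log ** 2 / 2

ratios = []
for sign in (-1, +1):                       # lower, then upper limit
    normal = m + sign * z * sd              # limit assuming normal distribution
    lognormal = math.exp(mu_log + sign * z * sigma_log)
    ratios.append(abs(lognormal - normal) / lognormal)

print([round(r, 3) for r in ratios])  # [0.01, 0.007]
```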
From logarithmized sample values

The following example of this method is based on the same values of fasting plasma glucose as used in the previous section, using e as the base:

Subject | FPG (mmol/L) | ln(FPG) | Difference from mean of ln(FPG) | Squared difference
1 | 5.5 | 1.70 | 0.029 | 0.000841
2 | 5.2 | 1.65 | −0.021 | 0.000441
3 | 5.2 | 1.65 | −0.021 | 0.000441
4 | 5.8 | 1.76 | 0.089 | 0.007921
5 | 5.6 | 1.72 | 0.049 | 0.002401
6 | 4.6 | 1.53 | −0.141 | 0.019881
7 | 5.6 | 1.72 | 0.049 | 0.002401
8 | 5.9 | 1.77 | 0.099 | 0.009801
9 | 4.7 | 1.55 | −0.121 | 0.014641
10 | 5.0 | 1.61 | −0.061 | 0.003721
11 | 5.7 | 1.74 | 0.069 | 0.004761
12 | 5.2 | 1.65 | −0.021 | 0.000441
Mean of ln(FPG) = 1.671 ≈ 1.67 (μlog) | | | | Sum/(n−1) = 0.068/11 = 0.0062 = variance; √0.0062 ≈ 0.079 = '''standard deviation of ln(FPG)''' (σlog)
Subsequently, the still-logarithmized lower limit of the reference range is calculated as:

lower limitlog = 1.67 − 2.20 × √(13/12) × 0.079 = 1.49
and the upper limit of the reference range as:

upper limitlog = 1.67 + 2.20 × √(13/12) × 0.079 = 1.85
Conversion back to non-logarithmized values is subsequently performed as:

lower limit = e^1.49 ≈ 4.4
upper limit = e^1.85 ≈ 6.4
Thus, the standard reference range for this example is estimated to be 4.4 to 6.4 mmol/L.
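The logarithmized calculation can likewise be sketched in Python on the example values (working with unrounded logarithms, which gives a standard deviation of about 0.080 rather than the rounded table's 0.079, but the same limits):

```python
import math
import statistics

# Fasting plasma glucose values (mmol/L) of the 12 reference subjects
fpg = [5.5, 5.2, 5.2, 5.8, 5.6, 4.6, 5.6, 5.9, 4.7, 5.0, 5.7, 5.2]

logs = [math.log(v) for v in fpg]     # natural logarithms
n = len(logs)
mu_log = statistics.mean(logs)        # ~1.67
sigma_log = statistics.stdev(logs)    # ~0.080 (0.079 in the rounded table)

t = 2.20                              # 97.5% t-quantile, 11 degrees of freedom
margin = t * math.sqrt((n + 1) / n) * sigma_log
lower = math.exp(mu_log - margin)
upper = math.exp(mu_log + margin)
print(round(lower, 1), round(upper, 1))  # 4.4 6.4
```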
From arithmetic mean and variance

By assuming that the expected value can represent the arithmetic mean in this case, the parameters μlog and σlog can be estimated from the arithmetic mean (m) and standard deviation (s.d.) as:

σlog = √(ln(1 + (s.d./m)²))
μlog = ln(m) − σlog²/2

Following the example reference group from the previous section:

σlog = √(ln(1 + (0.42/5.33)²)) ≈ 0.079
μlog = ln(5.33) − 0.079²/2 ≈ 1.67
Subsequently, the logarithmized, and later non-logarithmized, lower and upper limits are calculated just as with logarithmized sample values.
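A Python sketch of this parameter estimation, continuing to the reference limits with the same t-based margin as before:

```python
import math

m, sd = 5.33, 0.42   # arithmetic mean and standard deviation of the example

# Log-normal parameters estimated from arithmetic mean and variance
sigma_log = math.sqrt(math.log(1 + (sd / m) ** 2))   # ~0.079
mu_log = math.log(m) - sigma_log ** 2 / 2            # ~1.67

t = 2.20                                             # 97.5% t-quantile, 11 d.f.
margin = t * math.sqrt(13 / 12) * sigma_log
lower = math.exp(mu_log - margin)
upper = math.exp(mu_log + margin)
print(round(lower, 1), round(upper, 1))  # 4.4 6.4
```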
Directly from percentages of interest

This method can be used even when measurement values do not appear to conform conveniently to any form of normal distribution or other function. In this case, the reference limits are estimated directly as the 2.5th and 97.5th percentiles of the measurements in the reference group.
However, the reference range limits as estimated in this way have higher variance, and therefore less reliability, than those estimated by assuming a normal or log-normal distribution (when such is applicable), because the latter acquire statistical power from the measurements of the whole reference group rather than just the measurements at the 2.5th and 97.5th percentiles. Still, this variance decreases with increasing size of the reference group, and therefore, this method may be optimal where a large reference group can easily be gathered and the distribution mode of the measurements is uncertain.
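A sketch of this nonparametric approach, shown on the small example group purely for illustration (in practice it calls for a much larger reference group, as noted above):

```python
import statistics

fpg = [5.5, 5.2, 5.2, 5.8, 5.6, 4.6, 5.6, 5.9, 4.7, 5.0, 5.7, 5.2]

# Cut points every 2.5%: the first and last of the 39 cut points are
# the 2.5th and 97.5th percentiles
qs = statistics.quantiles(fpg, n=40, method='inclusive')
lower, upper = qs[0], qs[-1]
print(lower, upper)
```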
Interpretation of standard ranges in medical tests

A test result that falls outside an established reference range calls for further consideration. Such further consideration can be performed, for example, by an epidemiology-based differential diagnostic procedure, where potential candidate conditions are listed that may explain the finding, followed by calculations of how probable they are to have occurred in the first place, in turn followed by a comparison with the probability that the result would have occurred by random variability.
Probability of random variability

If the establishment of the reference range could have been made assuming a normal distribution, then the probability that the result would be an effect of random variability can be further specified as follows:
The standard deviation, if not given already, can be inversely calculated by the fact that the absolute value of the difference between the mean and either the upper or lower limit of the reference range is approximately 2 standard deviations (more accurately 1.96), and thus:

SD ≈ |limit − mean| / 1.96
The standard score for the individual's test can subsequently be calculated as:

z = (individual's value − mean) / SD
The probability that a value lies at a certain distance from the mean can subsequently be calculated from the relation between standard scores and prediction intervals. For example, a standard score of 2.58 corresponds to a prediction interval of 99%, corresponding to a probability of 0.5% that a result is at least that far from the mean in the absence of disease.
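This one-sided probability can be computed exactly from the normal cumulative distribution function, here via the error function in Python's standard library:

```python
import math

def prob_at_least(z):
    """One-sided probability that a normally distributed value lies
    at least z standard deviations above the mean."""
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(prob_at_least(2.58), 3))  # 0.005, i.e. 0.5%
```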
Example

In this case, an epidemiology-based differential diagnostic procedure is used, and its first step is to find candidate conditions that can explain the finding.
Hypercalcemia (usually defined as a calcium level above the reference range) is mostly caused by either primary hyperparathyroidism or malignancy.
Using, for example, epidemiology and the individual's risk factors, let's say that the probability that the hypercalcemia would have been caused by primary hyperparathyroidism in the first place is estimated to be 0.00125 (or 0.125%), the equivalent probability for cancer is 0.0002, and 0.0005 for other conditions. With a probability given as less than 0.025 of no disease, this corresponds to a probability that the hypercalcemia would have occurred in the first place of up to 0.02695. However, the hypercalcemia has occurred with a probability of 100%, resulting in adjusted probabilities of at least 4.6% that primary hyperparathyroidism has caused the hypercalcemia, at least 0.7% for cancer, at least 1.9% for other conditions, and up to 92.8% that there is no disease and the hypercalcemia is caused by random variability.
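The adjustment above is a simple renormalization, since the finding has occurred with certainty. A Python sketch using the probabilities from the text:

```python
# Probabilities that the hypercalcemia would have occurred in the
# first place, by candidate cause (figures from the text)
priors = {
    'primary hyperparathyroidism': 0.00125,
    'cancer': 0.0002,
    'other conditions': 0.0005,
    'no disease (random variability)': 0.025,
}

total = sum(priors.values())  # 0.02695
# The hypercalcemia has in fact occurred, so rescale to sum to 100%
adjusted = {cause: p / total for cause, p in priors.items()}
for cause, p in adjusted.items():
    print(f'{cause}: {100 * p:.1f}%')
```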
In this case, further processing benefits from specification of the probability of random variability:
The value is assumed to conform acceptably to a normal distribution, so the mean can be assumed to be 1.15 in the reference group. The standard deviation, if not given already, can be inversely calculated by knowing that the absolute value of the difference between the mean and, for example, the upper limit of the reference range, is approximately 2 standard deviations (more accurately 1.96), and thus:
The standard score for the individual's test is subsequently calculated as:
The probability of a value so much larger than the mean as to have a standard score of 3 corresponds to a probability of approximately 0.14% (given by (100% − 99.7%)/2, with 99.7% here being given from the 68–95–99.7 rule).
Using the same probabilities that the hypercalcemia would have occurred in the first place by the other candidate conditions, the probability that hypercalcemia would have occurred in the first place is 0.00335, and given the fact that hypercalcemia has occurred gives adjusted probabilities of 37.3%, 6.0%, 14.9% and 41.8%, respectively, for primary hyperparathyroidism, cancer, other conditions and no disease.
Optimal health range

An optimal health range is a range based on the levels that appear to have the optimal health impact, rather than on what is most prevalent in a reference group. It may be more appropriate to use for, e.g., folate, since approximately 90 percent of North Americans may actually suffer more or less from folate deficiency (Folic Acid: Don't Be Without It!, by Hans R. Larsen, MSc ChE, retrieved July 7, 2009),
but only the 2.5 percent that have the lowest levels will fall below the standard reference range. In this case, the actual folate ranges for optimal health are substantially higher than the standard reference ranges. Vitamin D has a similar tendency. In contrast, for, e.g., uric acid, having a level not exceeding the standard reference range still does not exclude the risk of getting gout or kidney stones. Furthermore, for many substances, the standard reference range is generally lower than the level of toxic effect.
A problem with the optimal health range is the lack of a standard method of estimating the ranges. The limits may be defined as those where the health risks exceed a certain threshold, but with various risk profiles between different measurements (such as folate and vitamin D), and even different risk aspects for one and the same measurement (such as both deficiency and toxicity of vitamin A), it is difficult to standardize. Subsequently, optimal health ranges, when given by various sources, have an additional variability caused by various definitions of the parameter. Also, as with standard reference ranges, there should be specific ranges for different determinants that affect the values, such as sex, age, etc. Ideally, there should rather be an estimation of what is the optimal value for every individual, taking all significant factors of that individual into account, a task that may be hard to achieve by studies, but long clinical experience by a physician may make this method preferable to using reference ranges.
One-sided cut-off values

Cut-off values may represent both standard ranges and optimal health ranges. Also, they may represent an appropriate value to distinguish healthy persons from those with a specific disease, although this gives additional variability by different diseases being distinguished. For example, for NT-proBNP, a lower cut-off value is used in distinguishing healthy babies from those with acyanotic heart disease, compared to the cut-off value used in distinguishing healthy babies from those with congenital nonspherocytic anemia (Screening for Congenital Heart Disease with NT-proBNP, by Emmanuel Jairaj Moses, Sharifah A.I. Mokhtar, Amir Hamzah, Basir Selvam Abdullah, and Narazah Mohd Yusoff; Laboratory Medicine 2011;42(2):75–80, American Society for Clinical Pathology).
General drawbacks

Reference ranges tend to give the impression of definite thresholds that clearly separate "good" from "bad" values, while in reality there are generally continuously increasing risks with increasing distance from usual or optimal values.
With this and uncompensated factors in mind, the ideal interpretation method of a test result would rather consist of a comparison of what would be expected or optimal in the individual when taking all factors and conditions of that individual into account, rather than strictly classifying the values as "good" or "bad" by using reference ranges from other people.
In a recent paper, Rappoport et al. described a novel way to redefine reference ranges from an electronic health record system. In such a system, a higher population resolution can be achieved (e.g., age-, sex-, race- and ethnicity-specific ranges).